METHOD AND APPARATUS FOR EFFECTIVE START UP OF TCP CONNECTIONS

OVER HIGH SPEED, LONG PROPAGATION PATHS 


BACKGROUND OF THE INVENTION 

TCP is a reliable data transfer protocol used widely over
the Internet for numerous applications, from FTP to HTTP. The
current implementation of TCP Reno/NewReno mainly includes two
phases: Slow-start and Congestion-avoidance. In the Slow-start
phase, a sender opens the congestion window (cwnd) exponentially,
doubling cwnd every Round-Trip Time (RTT) until it reaches the
Slow-start Threshold (ssthresh). The connection then switches to
Congestion-avoidance, where cwnd grows more conservatively, by
only 1 packet every RTT (that is, linearly). The initial ssthresh
is set to an arbitrary default value, ranging from 4 Kbytes to
64 Kbytes, depending on the operating system implementation.

When the initial ssthresh is set to an arbitrary value, TCP
performance may suffer from two potential problems: (a) if
ssthresh is set too high relative to the network Bandwidth Delay
Product (BDP), the exponential increase of cwnd generates too
many packets too fast, causing multiple losses at the bottleneck
router and coarse timeouts, with significant reduction of the
connection throughput; (b) if the initial ssthresh is set low
relative to BDP, the connection exits Slow-start and switches to
linear cwnd increase prematurely, resulting in poor startup
utilization, especially when BDP is large.

Recent studies reveal that a majority of TCP connections
are short-lived (mice), while a smaller number of long-lived
connections (elephants) carry most Internet traffic. A
short-lived connection usually terminates even before it reaches
"steady state", that is, before cwnd grows large enough to make
good use of the path bandwidth. Thus, the startup stage can
significantly affect the performance of the mice. In a large BDP
network, with the current Slow-start scheme, it takes many RTTs
for a TCP connection to reach the ideal window (equal to BDP).
For example, in the current Reno/NewReno implementation with the
initial ssthresh set to 32 Kbytes, a TCP connection takes about
100 sec to reach the ideal window over a path with a bottleneck
bandwidth of 100 Mbps and an RTT of 100 ms. The utilization in
the first 10 sec is a meager 5.97%. With the rapid development of
the Internet and ever-growing BDP, a more efficient Slow-start
mechanism is required to achieve good link utilization.
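
The figures quoted in the example above can be approximated with a
simple per-RTT model of Reno/NewReno startup. The sketch below is
illustrative only: the function name, the 1000-byte packet size, the
initial window of one packet, and the loss-free, once-per-RTT window
growth are assumptions that the text does not state, so the results
only roughly track the quoted numbers.

```python
def reno_startup(bandwidth_bps, rtt_s, ssthresh_bytes,
                 pkt_bytes=1000, horizon_s=10.0):
    """Idealized, loss-free Reno startup, advanced one RTT at a time.

    Returns (seconds until cwnd reaches the BDP, utilization over
    the first horizon_s seconds). pkt_bytes and the initial window
    of 1 packet are assumed values, not taken from the text.
    """
    bdp_pkts = bandwidth_bps * rtt_s / 8 / pkt_bytes  # ideal window
    ssthresh = ssthresh_bytes / pkt_bytes
    horizon_rtts = round(horizon_s / rtt_s)
    cwnd, rtts, sent = 1.0, 0, 0.0
    rtts_to_bdp = None
    while rtts_to_bdp is None or rtts < horizon_rtts:
        if rtts_to_bdp is None and cwnd >= bdp_pkts:
            rtts_to_bdp = rtts
        if rtts < horizon_rtts:
            # At most one BDP worth of packets is delivered per RTT.
            sent += min(cwnd, bdp_pkts)
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # Slow-start: double per RTT
        else:
            cwnd += 1                       # Congestion-avoidance: +1 per RTT
        rtts += 1
    utilization = sent / (bdp_pkts * horizon_rtts)
    return rtts_to_bdp * rtt_s, utilization


# 100 Mbps bottleneck, 100 ms RTT, initial ssthresh of 32 Kbytes.
time_to_bdp, util = reno_startup(100e6, 0.1, 32 * 1024)
print(f"time to ideal window: {time_to_bdp:.1f} s, "
      f"utilization in first 10 s: {util:.2%}")
```

Under these assumptions the model yields a time to the ideal window
on the order of 100 sec and a utilization of roughly 6% in the first
10 sec. A different packet size shifts both values.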

A variety of methods have been suggested to avoid multiple
losses and achieve higher utilization during the startup phase.
A larger initial cwnd, roughly 4 Kbytes, has been proposed. This
could greatly speed up transfers of only a few packets.
However, the improvement is still inadequate when BDP is very
large and the file to transfer is bigger than just a few
packets. Fast start uses the cwnd and ssthresh cached from recent
connections to reduce the transfer latency. The cached parameters
may be too aggressive or too conservative when network conditions
change.

Smooth start [WXRS] has been proposed to slow down the cwnd
increase when cwnd is close to ssthresh. The assumption here is
that the default value of ssthresh is often larger than the BDP,
which is no longer true in large bandwidth delay networks. In one
proposed solution, the initial ssthresh is set to the BDP
estimated using packet pair measurements. This method can be too
aggressive. Another proposed method, termed Shared Passive
Network Discovery (SPAND), derives optimal TCP initial
parameters. SPAND needs leaky bucket pacing for outgoing packets,
which can be costly and problematic in practice.

TCP Vegas detects congestion by comparing the throughput
achieved over a cycle of length equal to RTT with the expected
throughput implied by cwnd and baseRTT (minimum RTT) at the
beginning of the cycle. This method is applied in both the
Slow-start and Congestion-avoidance phases. During the Slow-start
phase, a Vegas sender doubles its cwnd only every other RTT, in
contrast with Reno's doubling every RTT. A Vegas connection exits
Slow-start when the difference between achieved and expected
throughput exceeds a certain threshold. However, Vegas may not be
able to achieve high utilization in large bandwidth delay
networks because of its over-estimation of RTT.
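
The Vegas slow-start exit test described above can be sketched as
follows. This is a minimal illustration, not the Vegas
implementation: the function name, the threshold parameter
`gamma_pkts`, and the choice to express the throughput difference in
packets (by scaling it with baseRTT) are assumptions made for the
example.

```python
def vegas_exits_slow_start(cwnd_pkts, base_rtt_s, rtt_s, gamma_pkts=1.0):
    """Return True when the expected/achieved throughput gap exceeds gamma.

    expected = cwnd / baseRTT  (throughput with no queueing delay)
    achieved = cwnd / RTT      (throughput over the last cycle)
    gamma_pkts is a hypothetical threshold, expressed in packets.
    """
    expected = cwnd_pkts / base_rtt_s
    achieved = cwnd_pkts / rtt_s
    # Scale the rate gap by baseRTT to express it as a packet count.
    gap_pkts = (expected - achieved) * base_rtt_s
    return gap_pkts > gamma_pkts


# No queueing: RTT equals baseRTT, so the sender stays in Slow-start.
print(vegas_exits_slow_start(100, 0.100, 0.100))
# RTT inflated by 10 ms: the gap exceeds the threshold.
print(vegas_exits_slow_start(100, 0.100, 0.110))
```

Because the test depends only on RTT samples, an RTT sample that is
larger than the true path delay widens the computed gap and can
trigger the exit early.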



SUMMARY 

In one aspect of the invention, a method, herein termed
Adaptive Start (AStart), is used at start up, or after a timeout
occurs. When a connection initially begins or re-starts after a
coarse timeout, AStart adaptively and repeatedly resets the TCP
Slow-start Threshold (ssthresh) based on the Eligible Rate
Estimation (ERE), as calculated in TCP Westwood (TCPW). Using ERE
provides the means for adapting to network conditions during the
startup phase. Thus a sender is able to grow the congestion
window (cwnd) quickly without incurring the risk of buffer
overflow and multiple losses. AStart can significantly improve
link utilization under various bandwidths, buffer sizes and
round-trip propagation times. Most importantly, the method avoids
both link under-utilization due to premature Slow-start
termination and multiple losses due to initially setting
ssthresh too high or increasing cwnd faster than appropriate.

In TCPW, a sender calculates ERE according to our previously
disclosed inventions (BE, CRB, or ABSE), and then uses ERE during
the congestion avoidance phase of TCP as follows:

if (3 DUPACKS are received)
    ssthresh = (ERE*RTTmin)/seg_size;
    if (cwnd > ssthresh) /* congestion avoid */
        cwnd = ssthresh;
    endif
endif

if (coarse timeout expires)
    cwnd = 1;
    ssthresh = (ERE*RTTmin)/seg_size;
    if (ssthresh < 2)
        ssthresh = 2;
    endif
endif









In AStart, a sender calculates ERE and uses ERE during start-up
or after a timeout as follows:



if (3 DUPACKS are received)
    switch to congestion avoidance phase;
else (ACK is received)
    if (ssthresh < (ERE*RTTmin)/seg_size)
        ssthresh = (ERE*RTTmin)/seg_size;
    endif
    if (cwnd > ssthresh) /* mini congestion avoid. phase */
        increase cwnd by 1/RTT;
    else if (cwnd < ssthresh) /* mini slow start phase */
        increase cwnd by 1;
    endif
endif

This mode of operation can be extended to the entire
lifetime of the connection, thus also protecting against random
errors and sudden increases of bottleneck bandwidth, as may occur
with nomadic users.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features, aspects, and advantages of the
present invention will become better understood with regard to
the following description, accompanying drawings, and attached
appendices where:

FIG. 1 is a block diagram of a computing device suitable for
hosting a TCP process in accordance with an exemplary embodiment
of the present invention;

Appendix A is a publication describing a first method of
calculating an ERE in accordance with an exemplary embodiment of
the present invention;

Appendix B is a publication describing a second method of
calculating an ERE in accordance with an exemplary embodiment of
the present invention;

Appendix C is a publication describing a third method of
calculating an ERE in accordance with an exemplary embodiment of
the present invention; and

Appendix D is an abbreviated presentation of the present
invention.

DETAILED DESCRIPTION 

In TCP Westwood (TCPW), the sender continuously monitors
ACKs from the receiver and computes its current Eligible Rate
Estimate (ERE). ERE relies on an adaptive estimation technique
applied to the ACK stream. The goal of ERE is to estimate the
eligible sending rate for a connection, and thus achieve high
utilization without starving other connections. Research on
active network estimation reveals that samples obtained using a
packet pair often reflect physical bandwidth, while samples
obtained using a long packet train give short-time throughput
estimates. Not having the luxury of estimating with active
probing packets, a TCPW sender carefully chooses sampling
intervals and filtering techniques to estimate the eligible
bandwidth share of a connection. DUPACKs and delayed ACKs are
also properly counted in the ERE computation.

In the current TCPW implementation, upon packet loss (indicated
by 3 DUPACKs or a timeout) the sender sets cwnd and ssthresh
based on the current ERE. TCPW uses the following algorithm to
set cwnd and ssthresh:

if (3 DUPACKS are received)
    ssthresh = (ERE*RTTmin)/seg_size;
    if (cwnd > ssthresh) /* congestion avoid */
        cwnd = ssthresh;
    endif
endif

if (coarse timeout expires)
    cwnd = 1;
    ssthresh = (ERE*RTTmin)/seg_size;
    if (ssthresh < 2)
        ssthresh = 2;
    endif
endif
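
The TCPW loss handling above can be sketched in Python as follows.
This is a minimal illustration under stated assumptions, not the
TCPW implementation: the class and function names are hypothetical,
ERE is assumed to be expressed in bytes per second, RTTmin in
seconds, and seg_size in bytes, so that cwnd and ssthresh are
counted in segments.

```python
from dataclasses import dataclass


@dataclass
class TcpwState:
    cwnd: float      # congestion window, in segments
    ssthresh: float  # slow-start threshold, in segments


def ere_window(ere_bytes_per_s, rtt_min_s, seg_size_bytes):
    """(ERE*RTTmin)/seg_size, i.e. the ERE-implied window in segments."""
    return ere_bytes_per_s * rtt_min_s / seg_size_bytes


def on_three_dupacks(state, ere, rtt_min, seg_size):
    # Set ssthresh from ERE and shrink cwnd only if it exceeds it.
    state.ssthresh = ere_window(ere, rtt_min, seg_size)
    if state.cwnd > state.ssthresh:
        state.cwnd = state.ssthresh


def on_coarse_timeout(state, ere, rtt_min, seg_size):
    # Restart from one segment; ssthresh never drops below 2.
    state.cwnd = 1
    state.ssthresh = max(ere_window(ere, rtt_min, seg_size), 2)


# Assumed sample values: ERE of 12.5e6 bytes/s (100 Mbps),
# RTTmin of 100 ms, 1000-byte segments.
state = TcpwState(cwnd=2000, ssthresh=64)
on_three_dupacks(state, 12.5e6, 0.1, 1000)
print(state)
```

With the assumed sample values, the ERE-implied window is about 1250
segments, so a cwnd of 2000 segments is cut back to that value
rather than halved.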

Adaptive Start (AStart) improves TCP startup performance.
AStart takes advantage of the Eligible Rate Estimation (ERE)
mechanism proposed in TCPW and adaptively and repeatedly resets
ssthresh during the slow-start phase. When ERE indicates that
there is more available capacity, the connection opens its cwnd
faster, ensuring better utilization. On the other hand, when ERE
indicates that the connection is close to steady state, it
switches to Congestion-avoidance, limiting the risk of buffer
overflow and multiple losses. As such, AStart significantly
enhances the performance of TCP connections, and the enhancement
increases as BDP increases. When BDP reaches around 750 packets,
the throughput for short-lived connections is an order of
magnitude higher than that of TCP Reno/NewReno.

AStart is a sender-side only modification to the traditional
Reno/NewReno slow start algorithm. The TCPW eligible rate
estimate is used to adaptively and repeatedly reset ssthresh
during the startup phase, both at connection startup and after
every coarse timeout. The pseudo code of the algorithm is as
follows. When an ACK arrives:

if (3 DUPACKS are received)
    switch to congestion avoidance phase;
else (ACK is received)
    if (ssthresh < (ERE*RTTmin)/seg_size)
        ssthresh = (ERE*RTTmin)/seg_size;
    endif
    if (cwnd > ssthresh) /* mini congestion avoid. phase */
        increase cwnd by 1/RTT;
    else if (cwnd < ssthresh) /* mini slow start phase */
        increase cwnd by 1;
    endif
endif
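
The per-ACK AStart step above can be sketched in Python as follows.
This is a minimal illustration under stated assumptions, not a
definitive implementation: the class and function names, the
`dupacks` argument, and the initial values of cwnd and ssthresh are
hypothetical. The pseudo code's "increase cwnd by 1/RTT" is
interpreted here as a per-ACK increment of 1/cwnd, which adds
roughly one segment per RTT; that interpretation is an assumption.
ERE is assumed to be in bytes per second, RTTmin in seconds, and
seg_size in bytes.

```python
from dataclasses import dataclass


@dataclass
class AStartState:
    cwnd: float = 1.0        # congestion window, in segments (assumed start)
    ssthresh: float = 2.0    # slow-start threshold, in segments (assumed start)
    in_startup: bool = True  # False once the startup phase has ended


def astart_on_ack(state, ere, rtt_min, seg_size, dupacks=0):
    """Apply one AStart step for an ACK arriving during startup."""
    if not state.in_startup:
        return
    if dupacks >= 3:
        state.in_startup = False        # switch to congestion avoidance
        return
    target = ere * rtt_min / seg_size   # (ERE*RTTmin)/seg_size
    if state.ssthresh < target:
        state.ssthresh = target         # ssthresh is only ever raised
    if state.cwnd > state.ssthresh:
        # Mini congestion avoidance: assumed per-ACK form of "1/RTT".
        state.cwnd += 1.0 / state.cwnd
    elif state.cwnd < state.ssthresh:
        state.cwnd += 1.0               # mini slow start
    # cwnd == ssthresh leaves cwnd unchanged, as in the pseudo code.


# ERE rises: ssthresh is reset upward and cwnd grows in slow-start fashion.
s = AStartState(cwnd=4, ssthresh=2)
astart_on_ack(s, 12.5e6, 0.1, 1000)
print(s)
```

Because ssthresh is only raised, never lowered, in this step, a
temporarily low ERE sample moves the sender into the linear
increase without shrinking the threshold already reached.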

In TCPW, an eligible rate estimate is determined after every
ACK reception. In AStart, when the current ssthresh is lower
than the window implied by ERE, the sender resets ssthresh higher
accordingly, and increases cwnd in slow-start fashion. Otherwise,
cwnd increases linearly to avoid overflow. In this way, AStart
probes the available network bandwidth for this connection, and
allows the connection to eventually exit Slow-start close to the
ideal window. Compared to Vegas, TCPW avoids premature exit of
slow start since it relies on both RTT and ACK intervals, while
Vegas relies only on RTT estimates.

By applying AStart, the sender does not overflow the
bottleneck buffer and thus multiple losses are avoided. In
effect, AStart consists of multiple mini-slow-start and
mini-congestion-avoidance phases. Thus, cwnd does not increase
as quickly as with other methods, especially as cwnd approaches
BDP. This prevents the temporary queue from building up too fast,
and thus prevents a sender from overflowing a small buffer. In
AStart, the cwnd increase follows a smoother curve when cwnd is
close to BDP. In the case of a plurality of connections, each
connection is able to estimate its share of bandwidth and switch
to Congestion-avoidance at the appropriate time. In addition,
AStart has a more appropriate (lower) slow-start exit cwnd,
thanks to the continuous estimation mechanism, which reacts to
new traffic and determines an eligible sending rate that is
no longer the entire bottleneck link capacity.

FIG. 1 is a block diagram of a computing device suitable for
hosting a transport protocol control process in accordance with
an exemplary embodiment of the present invention. A host 500
includes a processor 502 coupled via a bus 504 to a memory device
506, a storage device controller 508, and a network device
controller 510. The processor uses the network device controller
to control the operations of a network device 512 which is
adapted for communications using a transport protocol to transmit
data to a receiver 514 across a connection 516 through a computer
network 518.

The storage controller is coupled to a storage device 520
having a computer readable storage medium for storage of program
instructions 522 executable by the processor. The program
instructions are stored in the storage device until the processor
retrieves the program instructions and stores them in the memory.
The processor then executes the program instructions stored in
memory to implement the transport protocol control process as
previously described.

Although this invention has been described in certain
specific embodiments, many additional modifications and
variations would be apparent to those skilled in the art. It is
therefore to be understood that this invention may be practiced
otherwise than as specifically described. Thus, the present
embodiments of the invention should be considered in all respects
as illustrative and not restrictive, the scope of the invention
to be determined by claims supported by this application and the
claims' equivalents rather than the foregoing description.

ABSTRACT OF THE DISCLOSURE

A modified TCP slow-start mechanism. When a connection
initially begins or re-starts after a coarse timeout, the modified
TCP slow-start mechanism, called Adaptive Start (AStart),
adaptively and repeatedly resets the slow-start threshold
(ssthresh) based on an eligible sending rate estimation mechanism
proposed in TCP Westwood. By adapting to network conditions
during the startup phase, a sender is able to grow the congestion
window (cwnd) quickly without incurring a risk of buffer overflow
and multiple packet losses, thus improving link utilization under
various bandwidths, buffer sizes and round-trip propagation times.
The method avoids both under-utilization due to premature
slow-start termination and multiple packet losses due to
initially setting ssthresh too high or increasing cwnd too fast.